Combining Acoustic and Language Information for Emotion Recognition
Authors
Abstract
This paper reports on emotion recognition using both acoustic and language information in spoken utterances. Most previous efforts have focused on emotion recognition from acoustic correlates, although it is well known that language information also conveys emotion. To capture emotional information at the language level, we introduce the information-theoretic notion of 'emotional salience'. For the acoustic information, linear discriminant classifiers and k-nearest neighbor classifiers were used for emotion classification. Combining the acoustic and linguistic information is posed as a data fusion problem to obtain a joint decision. Results on spoken dialog data from a telephone-based human-machine interaction application show that combining acoustic and language information improves negative emotion classification by 45.7% over using acoustic information alone (with the linear discriminant classifier for the acoustic features) and by 32.9% over using language information alone.
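As a rough illustration of the language-level component described in the abstract, the sketch below estimates word-level emotional salience as the mutual information between a word and the emotion classes, and fuses the resulting language score with an acoustic classifier's score by a simple weighted combination. The toy corpus, the salience-weighted scoring rule, the fusion weights, and all function names are illustrative assumptions, not the paper's actual data, classifiers, or fusion scheme.

```python
from collections import Counter, defaultdict
from math import log2

# Toy labelled corpus of (utterance, emotion label) pairs. Purely illustrative;
# the paper used spoken dialog data from a telephone-based human-machine
# interaction application.
corpus = [
    ("this is not working at all", "negative"),
    ("thank you that was helpful", "non-negative"),
    ("i want to speak to a person", "negative"),
    ("yes that sounds fine", "non-negative"),
]
classes = ["negative", "non-negative"]

# Class priors P(e_k) and per-word class counts n(w, e_k) from the corpus.
class_counts = Counter(label for _, label in corpus)
priors = {e: class_counts[e] / len(corpus) for e in classes}
word_class = defaultdict(Counter)
for text, label in corpus:
    for w in set(text.split()):
        word_class[w][label] += 1

def emotional_salience(word):
    """Mutual information between a word and the emotion classes:
    sal(w) = sum_k P(e_k | w) * log2( P(e_k | w) / P(e_k) )."""
    total = sum(word_class[word].values())
    if total == 0:
        return 0.0
    sal = 0.0
    for e in classes:
        p_e_given_w = word_class[word][e] / total
        if p_e_given_w > 0:
            sal += p_e_given_w * log2(p_e_given_w / priors[e])
    return sal

def language_score(text):
    """Crude negative-emotion score: a salience-weighted vote over the words
    (an assumed scoring rule, not the paper's language classifier)."""
    score = 0.0
    for w in text.split():
        total = sum(word_class[w].values())
        if total:
            p_neg = word_class[w]["negative"] / total
            score += emotional_salience(w) * (1.0 if p_neg > 0.5 else -1.0)
    return score

def fuse(acoustic_score, text, w_acoustic=0.5, w_language=0.5):
    """Weighted combination of acoustic and language evidence;
    the fusion rule and the 0.5/0.5 weights are assumptions."""
    return w_acoustic * acoustic_score + w_language * language_score(text)

# The acoustic score would come from an acoustic classifier such as a linear
# discriminant or k-nearest neighbor classifier; here it is a made-up number.
print(fuse(acoustic_score=0.8, text="this is not working"))
```

In this sketch the two information sources are kept as separate scores until the final weighted sum, mirroring the decision-level view of the fusion problem; any trained combiner could replace the fixed weights.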
Similar articles
Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)
Commercial quadcopters, with many private, commercial, and public-sector applications, are a rapidly advancing technology. Currently, there is no guarantee of the safe operation of these devices in the community. Three different automatic commercial quadcopter identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the major obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Rapid acoustic model development using Gaussian mixture clustering and language adaptation
This work presents techniques for improved cross-language transfer of speech recognition systems to new, previously undeveloped, languages. Such techniques are particularly useful for target languages where minimal amounts of training data are available. We describe a novel method to produce a language-independent system by combining acoustic models from a number of source languages. This inter...